29  Lake thermal structures I

Learning Objectives

After completing this tutorial you should be able to

Download the directory for this project here, make sure the directory is unzipped, and move it to your bi328 directory. You can open the Rproj for this module either by double clicking on it (which will launch RStudio), by opening RStudio and then using File > Open Project, or by clicking on the R project icon in the top right of your program window and selecting Open Project.

There should be a file named 29_thermal-structure-I.qmd in that project directory. Use that file to work through this tutorial - you will hand in your rendered (“knitted”) Quarto file as your homework assignment. So, first thing, change the author in the YAML header to your name. You will use this Quarto document to record your answers. Remember to use comments to annotate your code; at minimum you should have one comment per code set1, though you may of course add as many comments as you need to be able to recall what you did. Similarly, take notes in the document as we discuss the discussion/reflection questions, but make sure that you go back and clean them up for “public consumption”.

  • 1 You should do this whether you are adding code yourself or using code from our manual, even if it isn’t commented in the manual. Especially when the code is already included for you, add comments describing how each function works/what it does as we introduce it during the participatory coding session, so that you can refer back to it later.

  • We are going to have to install a few specialized packages for this module. We will be using devtools::install_github(), which allows us to install a package directly from a GitHub repository2.

  • 2 Typically we have been installing packages from CRAN.

  • # install devtools if you have not already
    install.packages("devtools")
    
    # this will likely take a few minutes
    # if successful will say "DONE (GLMr)" at end of output
    devtools::install_github("CareyLabVT/GLMr", force = TRUE)
    
    # this will also take a few minutes
    # you might get a pop up asking if you want to compile from source
    devtools::install_github("CareyLabVT/glmtools", force = TRUE)

    Now we are ready to load all the packages we will need.

    # load libraries ----
    
    # reporting
    library(knitr)
    
    # visualization
    library(ggplot2)
    library(ggthemes)
    library(patchwork)
    
    # data wrangling
    library(dplyr)
    library(tidyr)
    library(readr)
    library(skimr)
    library(janitor)
    library(magrittr)
    library(lubridate)
    
    # General Lake Model
    library(GLMr)
    library(glmtools)
    
    # set other options ----
    options(scipen=999)
    
    knitr::opts_chunk$set(
      tidy = FALSE,
      message = FALSE,
      warning = FALSE)

    29.1 Macrosystems ecology

    Consider this

    Briefly describe the central questions explored in macrosystems ecology and outline some of the challenges facing the field.

    Did it!

    [Your answer here]

    29.2 The power of simulated data

    Consider this

    Describe the benefits and challenges of using simulated data to investigate complex questions, especially in the context of global change ecology.

    Did it!

    [Your answer here]

    In this module, we are going to set up a lake model to explore how different climate scenarios will impact the thermal structure of a lake.

    29.3 Get to know the model

    We are going to start off by plotting water temperature data using a default model lake and real climate forcing data to get to know the model.

    We are going to use additional scripts/files that allow you to tell the General Lake Model (GLM) how you want the model to be run. The GLM is not written in R; rather, we have wrapper functions written in R that allow us to interface with the model. The advantage of doing this is that we can run the model and then interpret the output in the same environment.

    We will use *.nml scripts to run the GLM simulations. This script sets all the initial conditions of the lake we want to simulate, i.e., it defines the parameters. It includes various sections, such as glm_setup, morphometry, and meteorology.

    We are going to run all of our simulations in separate directories that contain the input files/scripts defining the parameters; all output will also be written to that directory. If you check the file structure for our project directory you will notice that we have a folder called sim. This already contains a file called glm2.nml for a default lake. We will create a vector with our path so that we won’t constantly have to type in the path; instead we can use the object name.

    # set path for nml file
    nml_file <- "sim/glm2.nml"
    
    # read in nml file
    nml <- read_nml(nml_file)
    
    # see contents
    print(nml)
    &glm_setup
       sim_name = 'GLMSimulation'
       max_layers = 500
       min_layer_vol = 0.025
       min_layer_thick = 0.5
       max_layer_thick = 1.5
       Kw = 0.6
       coef_mix_conv = 0.125
       coef_wind_stir = 0.23
       coef_mix_shear = 0.2
       coef_mix_turb = 0.51
       coef_mix_KH = 0.3
       coef_mix_hyp = 0.5
    /
    &morphometry
       lake_name = 'AwesomeLake'
       latitude = 32.82
       longitude = 35.59
       bsn_len = 21000
       bsn_wid = 13000
       bsn_vals = 45
       H = -252.9, -251.9, -250.9, -249.9, -248.9, -247.9, -246.9, -245.9, -244.9, -243.9, -242.9, -241.9, -240.9, -239.9, -238.9, -237.9, -236.9, -235.9, -234.9, -233.9, -232.9, -231.9, -230.9, -229.9, -228.9, -227.9, -226.9, -225.9, -224.9, -223.9, -222.9, -221.9, -220.9, -219.9, -218.9, -217.9, -216.9, -215.9, -214.9, -213.9, -212.9, -211.9, -208.9, -207.9, -203.9
       A = 0, 9250000, 15200000, 17875000, 21975000, 26625000, 31700000, 33950000, 38250000, 41100000, 46800000, 51675000, 55725000, 60200000, 64675000, 69600000, 74475000, 79850000, 85400000, 90975000, 96400000, 102000000, 107000000, 113000000, 118000000, 123000000, 128000000, 132000000, 136000000, 139000000, 143000000, 146000000, 148000000, 150000000, 151000000, 153000000, 155000000, 157000000, 158000000, 160000000, 161000000, 162000000, 167000000, 170000000, 173000000
    /
    &time
       timefmt = 2
       start = '2000-03-31 00:00:00'
       stop = '2001-01-01 00:00:00'
       dt = 3600
       timezone = 2
    /
    &output
       out_dir = '.'
       out_fn = 'output'
       nsave = 24
       csv_lake_fname = 'lake'
       csv_outlet_allinone = .false.
       csv_outlet_fname = 'outlet_'
       csv_outlet_nvars = 3
       csv_outlet_vars = 'flow','temp','salt'
       csv_ovrflw_fname = 'overflow'
    /
    &init_profiles
       lake_depth = 41
       num_depths = 5
       the_depths = 1, 10, 20, 30, 40
       the_temps = 18, 18, 18, 18, 18
       the_sals = 0.5, 0.5, 0.5, 0.5, 0.5
    /
    &meteorology
       met_sw = .true.
       lw_type = 'LW_IN'
       rain_sw = .false.
       atm_stab = .false.
       catchrain = .false.
       rad_mode = 1
       albedo_mode = 1
       cloud_mode = 4
       subdaily = .true.
       meteo_fl = 'met_hourly.csv'
       wind_factor = 1
       rain_factor = 1
       sw_factor = 1
       lw_factor = 1
       at_factor = 1
       rh_factor = 1
       ce = 0.0013
       ch = 0.0013
       cd = 0.0013
       rain_threshold = 0.01
       runoff_coef = 0.3
    /
    &bird_model
       AP = 973
       Oz = 0.279
       WatVap = 1.1
       AOD500 = 0.033
       AOD380 = 0.038
       Albedo = 0.2
    /
    &inflow
       num_inflows = 0
    /
    &outflow
       num_outlet = 0
    /
    &snowice
       snow_albedo_factor = 1
       snow_rho_max = 300
       snow_rho_min = 50
    /
    &sed_heat
       sed_temp_mean = 9.7
       sed_temp_amplitude = 2.7
       sed_temp_peak_doy = 242.5
    /

    Let’s take a look at the various values that have been set for the different parameters that determine how our simulation will be run. Note, for example, that with dt = 3600 (an hourly time step, in seconds) and nsave = 24, output is saved once per simulated day (3600 s × 24 = 86,400 s = 1 day).
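As an aside, the H (elevations in m) and A (areas in m²) vectors in the &morphometry block define the lake’s hypsographic curve. A base-R sketch of what those two vectors encode - using only the first four value pairs from the printout above, purely for illustration and not part of the module’s workflow - approximates the water volume between layers by trapezoidal integration:

```r
# first four H/A pairs from the &morphometry block above
H <- c(-252.9, -251.9, -250.9, -249.9)   # layer elevations [m]
A <- c(0, 9250000, 15200000, 17875000)   # layer surface areas [m^2]

# trapezoidal rule: mean area of adjacent layers times layer thickness
layer_vol <- diff(H) * (head(A, -1) + tail(A, -1)) / 2
sum(layer_vol)   # approximate volume of the bottom three layers [m^3]
```

GLM uses this hypsography internally; the sketch simply shows how depth-area pairs translate into volumes.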

    First, let’s find out what our lake is called.

    # query lake name
    get_nml_value(nml, "lake_name")
    [1] "AwesomeLake"
    Give it a whirl

    Use the get_nml_value() function to query additional parameters that have been set in your *.nml file. You can do this by changing the argument lake_name to lake_depth, latitude, and longitude.

    Did it!

    [Your answer here]

    Now, let’s take a closer look at the meteorological input data for the duration of our simulation.

    plot_meteo(nml_file)

    Figure 29.1: Meteorological data describing drivers of the thermal structure of Lake Awesome.
    Consider this

    Briefly describe what meteorological data have been defined for this simulation run and argue why these are important parameters to know in order to accurately model the thermal structure of a lake.

    Did it!

    [Your answer here]

    Consider this

    Use the information you’ve pulled from the nml file to describe where your lake is located and what the maximum depth is.

    This lake is based on a real lake. Figure out where this lake is located (what the name of the lake is) and argue whether you think the meteorological data is reasonable for the model lake.

    Did it!

    [Your answer here]

    29.5 Run simulation & compare to real data

    Because this is a real lake, we also have observed data from sensors in the lake.

    Consider this

    Outline the importance of being able to assess the quality of a model and simulated data using observed data.

    Did it!

    [Your answer here]

    Now, let’s get to the good stuff and actually run the simulation. The model will pull the parameters from the nml file describing the lake, along with the information on drivers from the met_hourly.csv file in your sim folder.

    # run model
    run_glm("sim/", verbose = TRUE)
    [1] 0

    If you check your sim directory, you should now see output files that have been added (check the modified dates).

    The most important file is output.nc, which contains all the output data from our simulation. Again, we’ll create a vector with the path to the output file. This is our baseline simulated data, based on the observed conditions recorded in the met_hourly.csv file.

    baseline <- file.path("sim", "output.nc")

    Now, let’s pull the data describing daily surface water temperatures in the lake during our baseline simulation and store it in a data.frame so we can compare it to our climate scenario down the road.

    # create data frame with surface temperature
    surf_temp <- get_var(file = baseline, "temp", reference = "surface", z_out = c(1)) %>%
      rename(Baseline_Temp = temp_1)

    For now, let’s visualize our simulated water temperatures using a heat map.

    plot_temp(file = baseline, fig_path = FALSE)

    Figure 29.2: Thermal structure for our simulated lake.
    Consider this

    Briefly describe how the data is encoded in this figure and how it should be interpreted.

    Did it!

    [Your answer here]

    Consider this

    Describe the seasonal pattern of thermal structure in this lake across seasons. Be specific, e.g. use the color gradient scale to identify when and at which layers the lake is the warmest/coldest, what maximum and minimum temperatures are.

    Did it!

    [Your answer here]

    In your sim folder there is an additional file called field_data.csv that contains the observed data from buoys in the lake. Let’s see how the modeled data (baseline) compare to the observed field data for our lake.

    # define observed field data
    field_file <- file.path("sim", "field_data.csv")
    
    # plot simulated vs observed data
    plot_temp_compare(baseline, field_file)

    Figure 29.3: Comparison of field data and simulated temperatures.

    The top panel shows the observed data measured at different depths and times using high-frequency sensors on a buoy; each measurement is depicted by a black circle. The heat map represents the values interpolated between sensors.

    The bottom panel shows the modeled data.
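The idea of filling in values between sensor depths can be illustrated with base R’s approx() function. A linear-interpolation sketch using made-up sensor depths and temperatures (not the actual buoy data):

```r
# hypothetical sensor readings: temperature at three depths
sensor_depth <- c(1, 10, 20)   # depths of the sensors [m]
sensor_temp  <- c(24, 18, 8)   # measured water temperatures [C]

# linearly interpolate a temperature estimate at every metre between sensors
profile <- approx(x = sensor_depth, y = sensor_temp, xout = 1:20)
profile$y[5]   # estimated temperature at 5 m depth
```

The heat map does the same kind of filling-in, but in two dimensions (depth and time) and across many more sensors.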

    Consider this

    Briefly assess how well the model replicates the real-world lake (explain how you are coming to this conclusion); include information on the water temperatures and thermocline depths in your comparison.

    Did it!

    [Your answer here]

    We can also compare different physical lake characteristics between the simulated and observed lake.

    First, let’s take a look at what metrics we can compare.

    sim_metrics(with_nml = FALSE)
    [1] "buoyancy.freq"     "center.buoyancy"   "thermo.depth"     
    [4] "water.density"     "water.temperature"

    We can extract observed and simulated values for these metrics and compare them to assess the quality of our model.

    Let’s start by pulling out information on the thermocline depths and returning it as a data.frame.

    therm_depths <- compare_to_field(baseline, field_file, metric = "thermo.depth", 
                                     as_value = TRUE, na.rm = TRUE)
    
    head(therm_depths)
                  DateTime      obs       mod
    7  2000-04-06 01:00:00 11.78483  3.409255
    8  2000-04-07 01:00:00 11.79451  8.287353
    9  2000-04-08 01:00:00 14.53049 20.472547
    10 2000-04-09 01:00:00 15.72452 21.391872
    12 2000-04-11 01:00:00 19.94560 26.142652
    13 2000-04-12 01:00:00 19.94391  3.153735
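A quick way to put a single number on the mismatch in a table like this is the root-mean-square error (RMSE). A base-R sketch using only the six obs/mod pairs printed above (a hand-picked subset, so the number is illustrative only):

```r
# the six observed and modeled thermocline depths shown above [m]
obs <- c(11.78483, 11.79451, 14.53049, 15.72452, 19.94560, 19.94391)
mod <- c(3.409255, 8.287353, 20.472547, 21.391872, 26.142652, 3.153735)

# RMSE: typical magnitude of the model error, in the units of the data
rmse <- sqrt(mean((obs - mod)^2))
rmse   # close to 8.9 m for this handful of days
```

A smaller RMSE means the modeled thermocline tracks the observed one more closely; computing it over the full record (rather than six rows) gives a fairer summary.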
    Give it a whirl

    Plot the observed vs. simulated thermocline depths to compare the model and observed data.

    Did it!

    [Your answer here]

    This can be done in a straightforward way using a line plot.

    Figure 29.4: Comparison of modeled and observed thermoclines.
    Give it a whirl

    Extract the observed and model water temperatures into a data.frame called water_temp and create a plot to compare them.

    Did it!

    [Your answer here]

    Before you plot your data, take a look at the data set - you’ll notice that there are multiple observations for the same time point.

    Let’s use a scatter plot instead of a line plot to make sure we can see all the data.

    water_temp <- compare_to_field(baseline, field_file, metric = "water.temperature",
                                   as_value = TRUE, na.rm = TRUE)
    
    water_temp %>%
      pivot_longer(cols = 2:3, names_to = "dataset", values_to = "water_temp") %>%
      ggplot(aes(x = DateTime, y = water_temp, color = dataset)) +
      geom_point(shape = 1) +
      scale_color_manual(values = c("darkblue", "darkred")) +
      labs(x = "Date", y = "Water temperature [C]")

    Figure 29.5: Comparison of modeled and observed water temperatures.
    Give it a whirl

    Extract the observed and model water densities into a data.frame called water_density and create a plot to compare them.

    Did it!

    [Your answer here]

    Consider this

    Explain why water temperature and water densities have multiple data points for each day. Compare and contrast the patterns you observe between the two plots and explain the mechanism linking the two metrics.

    Define the terms epilimnion, hypolimnion, thermocline, stratification, and isothermal. Use these terms to describe the seasonal pattern of water temperatures and stratification.

    Did it!

    [Your answer here]
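The temperature-density mechanism can be made concrete with a standard freshwater density approximation (the formula given in McCutcheon et al. 1993, widely used in limnology; this is a textbook formula supplied here for illustration, not output from glmtools, and the function name is our own):

```r
# density of fresh water [kg m^-3] as a function of temperature [C]
# McCutcheon et al. (1993) approximation, valid for roughly 0-30 C
water_density_approx <- function(temp_c) {
  1000 * (1 - ((temp_c + 288.9414) /
               (508929.2 * (temp_c + 68.12963))) * (temp_c - 3.9863)^2)
}

# density peaks near 4 C and decreases as water warms,
# which is why warm epilimnion water floats over the cold hypolimnion
water_density_approx(c(4, 10, 25))
```

Note the non-linearity: the same temperature difference produces a larger density difference at warmer temperatures, which is part of why summer stratification is so stable.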

    Consider this

    Use your comparisons of observed and modeled data for various metrics (thermocline, water temperatures, water density) to assess the quality of our lake model and argue whether or not our modeled output is a good representation of the observed lake water temperatures.

    Did it!

    [Your answer here]

    29.6 Acknowledgments

    These activities are based on the EDDIE Climate Change Effects on Lake Temperatures module.3

  • 3 Carey, C.C., S. Aditya, K. Subratie, V. Daneshmand, R. Figueiredo, and K.J. Farrell. 24 August 2020. Macrosystems EDDIE: Climate Change Effects on Lake Temperatures. Macrosystems EDDIE Module 1, Version 2.